Dissection of SARS Coronavirus Spike Protein into Discrete Folded Fragments
Abstract: The spike protein of the severe acute respiratory syndrome coronavirus (SARS-CoV) mediates cell fusion by binding to target cell surface receptors. This paper reports a simple method for dissecting the viral protein and for searching for foldable fragments in a random but systematic manner. The method involves digestion of the gene by DNase I to generate a pool of short DNA segments, followed by an additional step of reassembly of these segments to produce a library of DNA fragments with random ends but controllable lengths. To rapidly screen for discrete folded polypeptide fragments, the reassembled gene fragments were further cloned into a vector as N-terminal fusions to a folding reporter gene, which was a variant of green fluorescent protein. Two foldable fragments were identified for the SARS-CoV spike protein, which coincide with various anti-SARS peptides derived from the heptad repeat (HR) region 2 of the spike protein. The method should be applicable to other viral proteins to isolate antigen or vaccine candidates, thus providing an alternative to the full-length proteins (subunits) or linear short peptides.

Received: 2005-12-31; revised: 2006-02-21
* Supported by the Tsinghua University SARS Special Fund and the National Key Basic Research and Development (973) Program of China (No. 2003CB716002)
** To whom correspondence should be addressed. E-mail: zhanglinlin@tsinghua.edu.cn; Tel: 86-10-62794403

Introduction

Severe acute respiratory syndrome coronavirus (SARS-CoV), which is the causative agent of the atypical pneumonia, was first identified in the fall of 2002 and is a new member of the family of coronaviruses. Its rapid transmission by means of aerosols and the high mortality rate (up to 10%) make SARS a potential global threat. An attractive approach to interfere with SARS disease progression focuses on one of the earliest infection processes by blocking the fusion process that mediates the delivery of the viral genome into the host cell. The spike protein (S protein) is a 180- to 200-kDa type-I transmembrane glycoprotein that is responsible for the initiation and propagation of infection by interacting with a cellular receptor to induce cell-to-cell fusion. Binding of the S protein to a specific soluble or cell surface glycoprotein receptor induces global changes in the conformation of the S protein that expose a previously hidden hydrophobic surface area, which allows the virions to interact with the host cell membrane.

Earlier attempted expressions of this S protein failed to obtain soluble full-length polypeptides in Escherichia coli (E. coli). Subsequent work aimed at identifying smaller but folded SARS-CoV spike fragments for use as possible antigen or vaccine candidates. The approach involved digestion and reassembly of the target gene to generate a pool of smaller DNA fragments with random ends but controllable lengths, which were screened for foldable fragments using a green fluorescent protein as a folding reporter. Two foldable fragments were identified, which coincide with various SARS peptides reported to have SARS neutralization activity. This dissection approach has the potential to be a generally applicable tool for producing foldable fragments of viral surface proteins that may provide discontinuous epitopes. These fragments should be easier to express in E. coli or other recombinant hosts.


1 Materials and Methods 
1.1 pET30a-linker-GFP construction 


The green fluorescent protein (GFP) gene was amplified from an in-house GFP-containing vector pET30a-hydA (Wang and Lin, unpublished result), which was in turn constructed from pQB-2, then ligated into the pET30a(+) (Novagen) to yield the pET30a-linker-GFP.
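
The linker of this construct can be checked in silico. The following is a minimal sketch, assuming the linker DNA is the 39-nucleotide stretch printed in Fig. 1 between the Nde I site (CATATG) and the GFP gene; the stray character after CATATG in the extracted figure is taken to be a box border, which is an assumption. The sketch translates the stretch with the standard genetic code and locates the EcoR I site (GAATTC) used for fragment insertion. The names `LINKER_DNA`, `translate`, and `CODON_TABLE` are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Linker DNA as read from Fig. 1 (assumed to start right after CATATG).
LINKER_DNA = "GGGAATTCTGCTGGCTCGAGTGCTGCTGGTTCTGGATCC"
ECORI_SITE = "GAATTC"

linker_peptide = translate(LINKER_DNA)
ecori_offset = LINKER_DNA.find(ECORI_SITE)  # 0-based position within the linker

print("Linker peptide:", linker_peptide)
print("EcoR I site starts at linker position:", ecori_offset)
```

Under these assumptions the translation reproduces the linker peptide GNSAGSSAAGSGS given in the Fig. 1 caption, and the EcoR I site lies within the first three codons, so inserted fragments end up upstream of most of the linker and of GFP.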


1.2 Fragment library construction 


The SARS-CoV spike gene was obtained from the Huada Beijing Genomics Institute. Fragmentation and reassembly of the target gene were performed as described by Lorimer and Pastan. The reassembled DNA sample was then purified and phosphorylated with T4 polynucleotide kinase at 37°C for 30 min. The backbone vector pET30a-linker-GFP was digested with EcoR I, blunt-ended with T4 DNA polymerase in the presence of 0.1 mmol/L each dNTP, and purified with a QIAgen® gel purification kit to remove residual enzyme activity. The linearized and blunt-ended vector was then dephosphorylated with shrimp alkaline phosphatase (SAP), followed by heat denaturation to deactivate the enzyme. The gene fragments and the backbone vector were ligated at 12°C overnight in the presence of 5% PEG8000 and then transformed into E. coli BL21(DE3) (Novagen) competent cells by electroporation.


1.3 Screening of fragments 


Transformed E. coli BL21(DE3) cells were plated on Luria-Bertani (LB) medium supplemented with 50 μg/mL kanamycin and grown overnight at 37°C, then grown further on the bench for about 20 h. The fluorescent colonies were picked, tested by colony PCR using primers flanking the fragment inserts, and sequenced. No isopropylthio-β-D-galactoside (IPTG) was used in these experiments, as it would inhibit the formation of fluorescent colonies.


1.4 Expression analysis of fusion proteins 


Saturated overnight cultures were diluted 100-fold into LB medium containing 50 μg/mL kanamycin and grown at 37°C for about 2 h to reach an optical density at 600 nm (OD600) of 0.5-0.6. Protein expression was induced with 0.2 mmol/L IPTG and continued for 4 h at 23°C. Cells were then collected and lysed for soluble protein extraction. The supernatant fractions (soluble protein) and cell pellets (insoluble protein) were resolved by SDS-PAGE using a 12% acrylamide gel.


2 Results and Discussion 


This work sought to identify smaller but folded SARS-CoV spike fragments for use as possible antigen or vaccine candidates. Compared with linear short peptides derived from the protein, folded fragments may be advantageous as they have the potential to provide discontinuous epitopes. The SARS-CoV spike gene was digested by DNase I to generate a pool of short DNA segments, followed by an additional step of reassembly of these segments to produce a library of DNA fragments with random ends. This is in part analogous to the DNA shuffling protocol, although the purpose here is not to produce full-length hybrids from a group of different parental genes but to generate various smaller DNA fragments from a single template gene. The reassembly step following the DNase I treatment is necessary to prepare a large number of DNA sequences with controlled lengths, which was achieved by tailoring the number of PCR cycles used in the reassembly (see Methods). To screen for discrete folded polypeptide fragments, the reassembled gene fragments were further cloned into a vector as N-terminal fusions to a folding reporter gene, which was a variant of the green fluorescent protein that exhibits strong fluorescence upon UV excitation. GFP has been shown to be an effective indicator for the foldability of the upstream polypeptide partner. The vector construction is shown in Fig. 1.
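
Because the fragments have random ends and are ligated blunt, only a minority of inserts can be expected to encode a sense, in-frame spike segment fused in frame to GFP. The following back-of-the-envelope sketch enumerates the possibilities under a deliberately simplified model that is not taken from this work: orientation, start phase, and insert length modulo 3 are assumed to be independent and uniformly distributed, and the phase values required by the vector junction (here 0 and 0) are hypothetical placeholders. Stop codons and foldability are ignored.

```python
from fractions import Fraction
from itertools import product


def productive_fraction():
    """Fraction of blunt-ligated inserts giving a sense, in-frame fusion.

    Simplified model (an assumption, not a measurement): orientation,
    start phase, and length modulo 3 are independent and uniform.
    """
    orientations = ("forward", "reverse")
    start_phases = (0, 1, 2)  # offset of the first insert base within a spike codon
    length_mod3 = (0, 1, 2)   # insert length modulo 3

    outcomes = list(product(orientations, start_phases, length_mod3))
    productive = [
        (orientation, phase, remainder)
        for orientation, phase, remainder in outcomes
        if orientation == "forward"  # sense strand follows the vector start codon
        and phase == 0               # spike codons read in their native frame (hypothetical value)
        and remainder == 0           # linker-GFP stays in frame (hypothetical value)
    ]
    return Fraction(len(productive), len(outcomes))


fraction = productive_fraction()
print(f"Productive inserts under this model: {fraction} ({float(fraction):.1%})")
```

Under these assumptions one insert in 18 is productive. The real proportion depends on the actual junction sequence and on the length distribution of the reassembled fragments, so the number should be read only as an order-of-magnitude guide.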

CATATG [GGGAATTCTGCTGGCTCGAGTGCTGCTGGTTCTGGATCC] GFP
        Linker

Fig. 1 Expression construct (pET30a-linker-GFP). The construct is flanked by the Nde I and Hind III sites and is otherwise identical to pET30a(+) (Novagen, Madison, WI). It contains a linker sequence GNSAGSSAAGSGS (boxed) upstream of the GFP gene, and an internal EcoR I site (underlined) used for insertion of gene fragments.

Among about 4300 clones screened, 230 clones were found to be fluorescent (see Fig. 2a). These clones were then subjected to rapid colony PCR analysis. Many of the fluorescent clones were found to contain vectors with SARS spike gene fragments smaller than 100 base pairs (bp). In addition, some others (a total of 20) contained vectors with inserts in the reverse orientation or not in frame, as indicated by sequencing. The SDS-PAGE results showed that the peptides encoded by these gene inserts were degraded in the corresponding fusion proteins (data not shown). Finally, only two inserts larger than 150 nucleotides (50 deduced amino acid (aa) residues) were identified: ssPtu-15 (residues 1118-1175 of the original protein) and ssPtu-16 (residues 1129-1186). The expression of these fragments (in the GFP-fusion form) was further examined by SDS-PAGE. As shown in Figs. 2b and 2c, the fragments were partially soluble when expressed at 23°C. Higher temperatures significantly reduced the amount of soluble protein (data not shown).
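
The screening numbers above can be put in proportion with a few lines of arithmetic. The sketch below uses only values quoted in the text and in the Fig. 2c caption (clone counts, residue ranges, and calculated masses); the function and variable names are illustrative.

```python
# Counts reported in the text.
CLONES_SCREENED = 4300        # "about 4300 clones screened"
FLUORESCENT = 230             # fluorescent colonies
REVERSE_OR_OUT_OF_FRAME = 20  # fluorescent clones with reverse or out-of-frame inserts
LARGE_FRAGMENT_HITS = 2       # ssPtu-15 and ssPtu-16

# Inclusive residue ranges in the full-length spike protein.
FRAGMENTS = {"ssPtu-15": (1118, 1175), "ssPtu-16": (1129, 1186)}

# Calculated masses from the Fig. 2c caption, in kDa.
GFP_MASS_KDA = 29.9
FUSION_MASS_KDA = 36.3


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def fragment_size(first, last):
    """Return (residues, nucleotides) for an inclusive residue range."""
    residues = last - first + 1
    return residues, 3 * residues


def mass_per_residue_da(fusion_kda, reporter_kda, residues):
    """Average mass contributed per inserted residue, in Da."""
    return (fusion_kda - reporter_kda) * 1000.0 / residues


print(f"Fluorescent: {percent(FLUORESCENT, CLONES_SCREENED):.1f}% of screened clones")
print(f"Large hits: {percent(LARGE_FRAGMENT_HITS, FLUORESCENT):.2f}% of fluorescent clones")
for name, (first, last) in FRAGMENTS.items():
    residues, nucleotides = fragment_size(first, last)
    per_residue = mass_per_residue_da(FUSION_MASS_KDA, GFP_MASS_KDA, residues)
    print(f"{name}: {residues} aa, {nucleotides} nt, ~{per_residue:.0f} Da per residue")
```

Both fragments span 58 residues (174 nucleotides), which is consistent with the stated threshold of 150 nucleotides. The mass difference between each fusion and GFP alone works out to about 110 Da per inserted residue, close to the commonly used average residue mass.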

Widely disparate virus families have been shown to contain two heptad repeat (HR) regions, which play a critical role in viral fusion with the target cell. Often, one N-terminal HR region (HR1) is adjacent to the cell fusion peptide while a C-terminal HR region (HR2) is close to the transmembrane anchor. The SARS fragment ssPtu-15 isolated in this work overlaps with the HR2 (residues 1147-1185) of the SARS-CoV spike protein, while fragment ssPtu-16 contains the whole SARS HR2 (Fig. 3a). A hydrophobic cluster analysis of these two fragments showed two significant hydrophobic clusters (Figs. 3b, 3c, and 3d). Presumably, these clusters play a role in the stability and oligomeric specificity of the HR2 structure. In addition, compared with the wild-type sequence, both of the fragments contain a mutation at 1163 (K replaced by E). ssPtu-16 also contains a second mutation at 1151 (from I to T), while ssPtu-15 contains a second mutation at 1157 (S substituted by Y). These mutations likely resulted from the fragment reassembly process. The mutation at 1163 seems to increase the helicity of the fragments. Interestingly, our more recent dissection studies with other proteins rarely resulted in mutations.
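
The overlap between the two fragments and HR2 follows directly from the residue ranges quoted above. The sketch below computes it and writes the reported substitutions in the usual wild-type/position/mutant notation; all coordinates come from the text, while the helper names are illustrative.

```python
# Inclusive residue ranges, as given in the text.
HR2 = (1147, 1185)
FRAGMENTS = {"ssPtu-15": (1118, 1175), "ssPtu-16": (1129, 1186)}

# Substitutions reported in the text: (wild type, position, mutant).
MUTATIONS = {
    "ssPtu-15": [("K", 1163, "E"), ("S", 1157, "Y")],
    "ssPtu-16": [("K", 1163, "E"), ("I", 1151, "T")],
}


def span_length(span):
    """Number of residues in an inclusive range."""
    return span[1] - span[0] + 1


def overlap(a, b):
    """Return the inclusive overlap of two ranges, or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None


def hr2_coverage(fragment):
    """Fraction of HR2 residues contained in the fragment."""
    shared = overlap(fragment, HR2)
    return 0.0 if shared is None else span_length(shared) / span_length(HR2)


def mutation_labels(name):
    """Format substitutions such as K1163E, sorted by position."""
    ordered = sorted(MUTATIONS[name], key=lambda m: m[1])
    return [f"{wt}{pos}{mut}" for wt, pos, mut in ordered]


for name, span in FRAGMENTS.items():
    shared = overlap(span, HR2)
    print(f"{name}: HR2 overlap {shared}, coverage {hr2_coverage(span):.0%}, "
          f"mutations {', '.join(mutation_labels(name))}")
```

With these coordinates ssPtu-15 shares residues 1147-1175 with HR2 (29 of its 39 residues), ssPtu-16 covers HR2 completely, and all three reported substitutions fall inside the HR2 range.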


(a) Colonies obtained from insertion and expression of SARS-CoV spike gene fragments in pET30a-linker-GFP using E. coli BL21(DE3) as the host.

(b) Images of the supernatant fractions for the clones containing ssPtu-15, ssPtu-16, and GFP itself, with E. coli BL21(DE3) as control (denoted as “C”). All the pictures were taken under UV irradiation.

(c) Coomassie brilliant blue-stained 12% acrylamide SDS-PAGE using E. coli BL21(DE3) as control (denoted as “C”). Calculated molecular masses for GFP, ssPtu-15, and ssPtu-16 were 29.9 kDa, 36.3 kDa, and 36.3 kDa, respectively. Corresponding band positions are indicated by arrows. “s” indicates supernatants of lysates and “in” denotes insoluble pellets of the lysates. M: protein marker, broad range (NEB), whose bands were 175, 83, 62, 48, 33, 25, and 17 kDa.

Fig. 2 Expression of fusion proteins

(a) CLUSTALW alignment of ssPtu-15 and ssPtu-16 with HR2-derived peptides which interfere with SARS-CoV S-mediated fusion to host cells: peptide CP-1; peptides HR2, GST-HR2-38, and GST-HR2-44; and peptides SHR2-1, SHR2-2, SHR2-8, and SHR2-9.

ssPtu-15 TVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINAYVVNIQEEIDRLNEVAKNL
ssPtu-16 SFKEELDKYFKNHTSPDVDLGDTSGINASVVNIQEETDRLNEVAKNLNESLIDLQELG

(b) Secondary structure prediction for ssPtu-15 and ssPtu-16 by 3D-Jury (http://bioinfo.pl/Meta/). E, β-strand; H, α-helix.

(c) and (d) Hydrophobic cluster analysis (HCA) plots for ssPtu-15 and ssPtu-16 drawn using DRAWHCA (http://bioserv.rpbs.jussieu.fr). Protein sequences are displayed on a duplicated helix using one-letter codes for the amino acids except for prolines (star), glycines (diamond), threonines (open square), and serines (dotted square). Hydrophobic residues are automatically contoured.

Fig. 3 Sequence analysis


Several studies have reported that SARS-CoV S-mediated fusion can be inhibited by HR2-derived but not HR1-derived peptides, most likely by interfering with the six-helix bundle formation, a process essential to drive the fusion reaction and to initiate infection. For the majority of these peptides, micromolar concentrations were required for efficient inhibition of the viral infection, indicating that although these peptides are effective, further optimization is required to achieve efficient inhibition of SARS-CoV in infected individuals. Given the high similarity of ssPtu-15 and ssPtu-16 with these peptides derived from the HR2 region, ssPtu-15 and ssPtu-16 may both have potential as therapeutic agents for the direct inhibition of SARS-CoV cell entry, as anti-SARS vaccine candidates, and as reagents in a high-throughput assay for screening small-molecule inhibitors of SARS envelope-mediated cell fusion.

In summary, the dissection approach described in this study has the potential to produce foldable fragments of viral surface proteins that may be useful for the design of antiviral compounds and provide alternative antigen or vaccine candidates. The method is independent of the target protein and thus can be applied to various viral proteins. The process is also simple and rapid. The method should also be applicable for dissecting and understanding non-viral proteins, for example, to identify smaller polypeptide units that are structurally, functionally, or evolutionarily relevant.